AITopics | egocentric video

ef01d91aa87e7701aa9c8dc66a2d5bdb-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Apr-30-2026, 05:56:29 GMT

artificial intelligence, machine learning, natural language, (18 more...)

Genre: Overview (0.46)

Industry:

Law (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
(2 more...)

EgoEnv: Human-centric environment representations from egocentric video

Neural Information Processing Systems | Apr-29-2026, 14:40:33 GMT

First-person video highlights a camera-wearer's activities in the context of their persistent environment. However, current video understanding approaches reason over visual features from short video clips that are detached from the underlying physical space and capture only what is immediately visible. To facilitate human-centric environment understanding, we present an approach that links egocentric video and the environment by learning representations that are predictive of the camera-wearer's (potentially unseen) local surroundings. We train such models using videos from agents in simulated 3D environments where the environment is fully observable, and test them on human-captured real-world videos from unseen environments. On two human-centric video tasks, we show that models equipped with our environment-aware features consistently outperform their counterparts with traditional clip features. Moreover, despite being trained exclusively on simulated videos, our approach successfully handles real-world videos from HouseTours and Ego4D, and achieves state-of-the-art results on the Ego4D NLQ challenge.

artificial intelligence, machine learning, object-oriented architecture, (18 more...)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.68)

Estimating Ego-Body Pose from Doubly Sparse Egocentric Video Data

Neural Information Processing Systems | Mar-20-2026, 22:26:29 GMT

We study the problem of estimating the body movements of a camera wearer from egocentric videos. Current methods for ego-body pose estimation rely on temporally dense sensor data, such as IMU measurements from spatially sparse body parts like the head and hands. However, we propose that even temporally sparse observations, such as hand poses captured intermittently from egocentric videos during natural or periodic hand movements, can effectively constrain overall body motion. Naively applying diffusion models to generate full-body pose from head pose and sparse hand pose leads to suboptimal results. To overcome this, we develop a two-stage approach that decomposes the problem into temporal completion and spatial completion. First, our method employs masked autoencoders to impute hand trajectories by leveraging the spatiotemporal correlations between the head pose sequence and intermittent hand poses, providing uncertainty estimates. Subsequently, we employ conditional diffusion models to generate plausible full-body motions based on these temporally dense trajectories of the head and hands, guided by the uncertainty estimates from the imputation. We validate the effectiveness of our method through comprehensive experiments on various HMD setups with the AMASS and Ego-Exo4D datasets.

artificial intelligence, machine learning, proceedings, (8 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.86)

EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views

Neural Information Processing Systems | Mar-20-2026, 21:53:35 GMT

Understanding egocentric human-object interaction (HOI) is a fundamental aspect of human-centric perception, facilitating applications like AR/VR and embodied AI. For egocentric HOI, in addition to perceiving semantics, e.g., "what" interaction is occurring, capturing "where" the interaction specifically manifests in 3D space is also crucial, as it links perception and operation. Existing methods primarily leverage observations of HOI to capture interaction regions from an exocentric view. However, incomplete observations of the interacting parties in the egocentric view introduce ambiguity between visual observations and interaction contents, impairing the efficacy of these methods. From the egocentric view, humans integrate the visual cortex, cerebellum, and brain to internalize their intentions and interaction concepts of objects, allowing them to pre-formulate interactions and act even when interaction regions are out of sight.

artificial intelligence, name change, proceedings, (7 more...)

Technology: Information Technology > Artificial Intelligence (0.61)

ef01d91aa87e7701aa9c8dc66a2d5bdb-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Feb-17-2026, 20:43:04 GMT

artificial intelligence, machine learning, natural language, (17 more...)

Country: North America > United States (0.04)

Genre: Overview (0.46)

Industry:

Law (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
(2 more...)

bd2605c5d854837aaf095537e82f1883-Paper-Conference.pdf

Neural Information Processing Systems | Feb-16-2026, 20:28:52 GMT

artificial intelligence, machine learning, object-oriented architecture, (18 more...)

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.68)
Information Technology > Artificial Intelligence > Robots (0.67)

633b0e871a48d542280c3ad03928e60d-Paper-Conference.pdf

Neural Information Processing Systems | Feb-15-2026, 11:05:45 GMT

artificial intelligence, machine learning, natural language, (19 more...)

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)

EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views

Yuhang Yang

Neural Information Processing Systems | Feb-15-2026, 08:55:38 GMT

For the egocentric HOI, in addition to perceiving semantics e.g., "what" interaction is occurring, capturing "where" the interaction specifically manifests in 3D space is

artificial intelligence, machine learning, natural language, (14 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Guangxi Province > Nanning (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(5 more...)

6a412f0037b0df295a39a198666ea6a6-Paper-Conference.pdf

Neural Information Processing Systems | Feb-13-2026, 09:00:09 GMT

machine learning, natural language, recognition, (17 more...)

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Europe > Italy > Sardinia (0.04)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
(2 more...)

A Self Validation Network for Object-Level Human Attention Estimation

Zehua Zhang, Chen Yu, David Crandall

Neural Information Processing Systems | Feb-11-2026, 23:57:21 GMT

Some recent work [22, 66, 68] has discussed estimating probability maps of ego-attention or predicting gaze points in egocentric videos. However, people think not in terms of points in their field of view, but in terms of the objects that they are attending to. Of course, the object of interest could be obtained by first estimating the gaze with a gaze estimator and generating object candidates from an off-the-shelf object detector, and then picking the object that the estimated gaze falls in. Because this bottom-up approach estimates where and what separately, it could be doomed to fail if the eye gaze prediction is slightly inaccurate, such as falling between two objects or in the intersection of multiple object bounding boxes (Figure 1).
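
The failure mode of the bottom-up baseline described above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the authors' code: the `Box` class, the function names, and the toy coordinates are all hypothetical, and the gaze point and boxes are written by hand rather than produced by a real gaze estimator or object detector.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Box:
    """Hypothetical detector output: a labelled axis-aligned bounding box."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, point: Point) -> bool:
        x, y = point
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def boxes_containing_gaze(gaze: Point, boxes: List[Box]) -> List[Box]:
    """Return every candidate box that the estimated gaze point falls in."""
    return [box for box in boxes if box.contains(gaze)]


def pick_attended_object(gaze: Point, boxes: List[Box]) -> Optional[Box]:
    """Bottom-up selection: succeed only if exactly one box contains the gaze.

    Returns None when the gaze lands in no box (e.g. between two objects)
    or in the intersection of several boxes, since the choice is ambiguous.
    """
    hits = boxes_containing_gaze(gaze, boxes)
    return hits[0] if len(hits) == 1 else None


# Toy scene (assumed values): two overlapping detections.
cup = Box("cup", 0.0, 0.0, 4.0, 4.0)
phone = Box("phone", 3.0, 0.0, 8.0, 4.0)
scene = [cup, phone]

clear_pick = pick_attended_object((1.0, 2.0), scene)    # inside the cup only
overlap_pick = pick_attended_object((3.5, 2.0), scene)  # inside both boxes
miss_pick = pick_attended_object((9.0, 9.0), scene)     # inside neither box
```

In this toy scene, moving the gaze estimate from x = 1.0 to x = 3.5 is enough to turn a clear answer into an ambiguous one, which is the sensitivity that the passage attributes to estimating where and what separately.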

artificial intelligence, machine learning, video, (17 more...)

Country:

North America > United States > Indiana (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
